Top-K Answering under Uncertain Schema Mappings

Authors

  • SHANXIAN MAO
  • Longzhuang Li
  • Ahmed M. Mahdy
  • Dulal C. Kar
Abstract

The data sources of information systems running on different hardware and software platforms are independent of each other and mutually closed, which makes data exchange difficult. As information technology has evolved, data sharing between internal departments and with external enterprises has become a necessity, and data integration was developed in response. A data integration system provides a bridge between isolated sources and a platform for information exchange. However, driven by current market demands, big-data sources have become one of the main burdens on the transaction rates of data integration systems. Two semantics, by-table and by-tuple, have been developed to capture top-k answering in data integration systems. Both aim to improve performance when the system encounters uncertain queries or unclear schema mappings between the local sources and the centralized system. Although current algorithms support features for capturing accurate top-k answers and try to avoid accessing all data from the sources, in most cases they cannot effectively minimize the number of traversed items. We therefore propose solutions to improve the efficiency of data integration under uncertainty. In our research, we apply histogram-based approximation to obtain an estimated list of top-k results, so that large amounts of data can be processed more efficiently. Histogram-based approximation generates approximate values from histograms provided by the sources, and the approximate values are summarized for ...
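
The idea of estimating top-k results from source histograms can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes, purely for illustration, that every source reports per-bucket counts over the same score boundaries, that scores are spread uniformly inside a bucket, and that schema-mapping uncertainty is ignored. The function name `estimate_topk_threshold` and its parameters are hypothetical.

```python
from typing import List


def estimate_topk_threshold(
    edges: List[float], source_counts: List[List[int]], k: int
) -> float:
    """Estimate the score of the k-th best item from per-source histograms.

    edges: ascending bucket boundaries shared by all sources (assumption).
    source_counts: one list of per-bucket item counts for each source.
    """
    n_buckets = len(edges) - 1
    # Merge the histograms by summing the counts of matching buckets.
    merged = [sum(counts[i] for counts in source_counts) for i in range(n_buckets)]

    remaining = k
    # Walk from the highest-scoring bucket downwards.
    for i in range(n_buckets - 1, -1, -1):
        count = merged[i]
        if count == 0:
            continue
        if count >= remaining:
            low, high = edges[i], edges[i + 1]
            # Assume a uniform spread inside the bucket and interpolate.
            return high - (remaining / count) * (high - low)
        remaining -= count
    # Fewer than k items overall: every item belongs to the answer.
    return edges[0]


edges = [0.0, 50.0, 100.0]
source_counts = [[10, 4], [6, 2]]  # two sources, two buckets each
print(estimate_topk_threshold(edges, source_counts, 3))  # 75.0
```

With such an estimate, a mediator could ask each source only for items scoring at or above the threshold instead of traversing all items. Because the value is an approximation, the result may contain slightly more or fewer than k items and may need a follow-up request.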

Similar resources

Deliverable D2.1 First Year Report on Workpackage 2

Objectives of the WP. The key objectives were: 1. Develop new techniques and algorithms for schema mapping and Web data exchange; and 2. Develop new techniques and algorithms for improving query evaluation by making use of schema information. The first objective is represented by tasks T2.1 (Schema Mappings for XML) and T2.2 (Data Exchange for XML Documents), and the second by tasks T2.3 (Query...

Managing Uncertainty in Schema Matching with Top-K Schema Mappings

In this paper, we propose to extend current practice in schema matching with the simultaneous use of top-K schema mappings rather than a single best mapping. This is a natural extension of existing methods (which can be considered to fall into the top-1 category), taking into account the imprecision inherent in the schema matching process. The essence of this method is the simultaneous generati...

Three Easy Pieces on Schema Mappings for Tree-structured Data

Schema mappings specify how data organized under a source schema should be reorganized under a target schema. For tree-structured data, the interplay between these specifications and the complex structural conditions imposed by the schemas, makes schema mappings a very rich formalism. Classical data management tasks, well-understood in the relational model, once again become challenging theoret...

Graph Data Exchange with Target Constraints

Data exchange is the problem of translating data structured under a source schema according to a target schema and a set of source-to-target constraints known as schema mappings. In this paper, we investigate the problem of data exchange in a heterogeneous setting, where the source is a relational database, the target is a graph database, and the schema mappings are defined across them. We stud...

Query Processing under GLAV Mappings for Relational and Graph Databases

Schema mappings establish a correspondence between data stored in two databases, called source and target respectively. Query processing under schema mappings has been investigated extensively in the two cases where each target atom is mapped to a query over the source (called GAV, global-as-view), and where each source atom is mapped to a query over the target (called LAV, local-as-view). The ...

Publication date: 2012